A General-Purpose Provenance Library
Abstract
Most provenance capture takes place inside particular tools – a workflow engine, a database, an operating system, or an application. However, most users have an existing toolset – a collection of different tools that work well for their needs and with which they are comfortable. Currently, such users have limited ability to collect provenance without disrupting their work and changing environments, which most users are hesitant to do. Even users who are willing to adopt new tools may realize limited benefit from provenance in those tools if the tools do not integrate with their entire environment, which may include multiple languages and frameworks. We present the Core Provenance Library (CPL), a portable, multi-lingual library that application programmers can easily incorporate into a variety of tools to collect and integrate provenance. Although manual instrumentation adds extra work for application programmers, we show that in most cases the work is minimal, and the resulting system solves several problems that plague more constrained provenance collection systems.
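To make the idea of manual instrumentation concrete, the following is a minimal sketch of what adding provenance calls to an existing tool might look like. It does not use CPL's actual API: the `ProvenanceGraph` class, its `data_flow` and `ancestors` methods, and the file names are all hypothetical, chosen only to illustrate recording data-flow edges from application code and later querying lineage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ProvenanceGraph:
    """Hypothetical in-memory provenance store (an assumption, not the CPL API)."""

    # Maps each object name to the set of objects it was directly derived from.
    parents: Dict[str, Set[str]] = field(default_factory=dict)

    def add_object(self, name: str) -> None:
        """Register an object if it is not already known."""
        self.parents.setdefault(name, set())

    def data_flow(self, source: str, dest: str) -> None:
        """Record that `dest` was derived from `source`."""
        self.add_object(source)
        self.add_object(dest)
        self.parents[dest].add(source)

    def ancestors(self, name: str) -> Set[str]:
        """Return every object that `name` transitively depends on."""
        seen: Set[str] = set()
        stack = list(self.parents.get(name, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return seen


graph = ProvenanceGraph()


def normalize(values: List[float], src: str, dst: str) -> List[float]:
    """An existing tool function with one added instrumentation line."""
    total = sum(values)
    result = [v / total for v in values]
    graph.data_flow(src, dst)  # the only provenance-specific line
    return result


# Two tools in the same environment report into the same graph.
normalized = normalize([1, 3], "raw.csv", "normalized.csv")
graph.data_flow("normalized.csv", "plot.png")
lineage = graph.ancestors("plot.png")
```

In this sketch the instrumentation cost is a single call per derived object, and because both steps write to one shared graph, the lineage of `plot.png` spans both tools.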
Similar resources
Language-integrated provenance in Haskell
Scientific progress increasingly depends on data management, particularly to clean and curate data so that it can be systematically analyzed and reused. A wealth of techniques for managing and curating data (and its provenance) have been proposed, largely in the database community. In particular, a number of influential papers have proposed collecting provenance information explaining where a p...
Report From the CoalFace: Lessons Learnt Building A General-Purpose Always-On Provenance System
Over the past year we have implemented OPUS, an always-on system for observed provenance capture in user-space. In this paper we present some important lessons for anyone hoping to implement a general purpose provenance system operating at user-level. In particular, we highlight the problems and solutions associated with the explosion of interposition requirements attributable to function varia...
Toward a Theory of Self-explaining Computation
Provenance techniques aim to increase the reliability of human judgments about data by making its origin and derivation process explicit. Originally motivated by the needs of scientific databases and scientific computation, provenance has also become a major issue for business and government data on the Web. However, so far provenance has been studied only in relatively restrictive settings: ty...
SPROV 2.0: A Highly-Configurable Platform-Independent Library for Secure Provenance
Data provenance allows us to explore the lineage and derivation history of data objects. As data and its provenance flow between people and tasks in potentially untrusted environments, it becomes essential to provide integrity and confidentiality assurances for provenance. Any solution also needs to be efficient, modular, and easy to deploy. In this poster and demonstration proposal, we discuss...
Inspector: A Data Provenance Library for Multithreaded Programs
Data provenance strives to explain how a computation was performed by recording a trace of the execution. The provenance trace is useful across a wide range of workflows to improve the dependability, security, and efficiency of software systems. In this paper, we present INSPECTOR, a POSIX-compliant data provenance library for shared-memory multithreaded programs. The INSPECTOR library is ...
Publication date: 2012